(a)

<!-- Description of application data -->

<ProductDescription> 

IBM personal computer 

</ProductDescription> 

<ProductCode pcode="5550" ccode="IBM" /> 

<!-- Description of error correction information for application data -->

<ec:ecc ID="01"> 

<ec:div val_ec="7C21E6237893963EC25D23A84952C21E">
<ec:target ec:path="//ProductDescription/text()"/>
<ec:target ec:path="//ProductCode/@pcode"/>
<ec:target ec:path="//ProductCode/@ccode"/>
</ec:div> 
</ec:ecc> 



(b) 

<ec:ecc> 

<ec:div val_ec="9E01E6237AB3952F7C21E623D73223AB">

<ec:target ec:path='//ecc[ID="01"]/div//@*'/>
</ec:div>
</ec:ecc> 



Fig. 5 
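
The listing in Fig. 5(a) ties one check value to three targets named by XPath-style paths. The following is a minimal Python sketch of that idea, not the method of the figure: it resolves only the three path shapes shown in (a), and it uses an MD5 digest as a stand-in check value. The digest is an assumption chosen because it is also 32 hexadecimal digits long; it can only detect a misread character, not correct it. The `root` wrapper element and the names `resolve_target` and `check_value` are hypothetical.

```python
import hashlib
import xml.etree.ElementTree as ET

# Application data from Fig. 5(a), wrapped in a hypothetical <root> element.
DOC = """<root>
<ProductDescription>
IBM personal computer
</ProductDescription>
<ProductCode pcode="5550" ccode="IBM" />
</root>"""


def resolve_target(root, path):
    """Resolve '//Elem/text()', '//Elem/@attr' or '//Elem' (not full XPath)."""
    if not path.startswith("//"):
        raise ValueError("this sketch only handles paths that start with //")
    steps = path[2:].split("/")
    last = steps[-1]
    if last == "text()":
        elem_steps = steps[:-1]
        pick = lambda e: (e.text or "").strip()
    elif last.startswith("@"):
        attr = last[1:]
        elem_steps = steps[:-1]
        pick = lambda e: e.get(attr, "")
    else:
        elem_steps = steps
        pick = lambda e: "".join(e.itertext()).strip()
    return [pick(e) for e in root.findall(".//" + "/".join(elem_steps))]


def check_value(root, paths):
    """Stand-in check value (assumption): MD5 over the resolved target strings."""
    values = [v for p in paths for v in resolve_target(root, p)]
    joined = "\x1f".join(values)  # unit separator keeps the fields distinct
    return hashlib.md5(joined.encode("utf-8")).hexdigest().upper()


PATHS = [
    "//ProductDescription/text()",
    "//ProductCode/@pcode",
    "//ProductCode/@ccode",
]
```

A reader of the printed data could recompute `check_value(ET.fromstring(text), PATHS)` on the recognized text and compare it with the printed value; a mismatch signals that at least one target was misread.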
(a)

<ec:word type="ProperNoun"> Suzuki Ichiro </ec:word> 
<ec:word type="Abbreviation">XML</ec:word> 
<ec:word type="TagName">ProductCode</ec:word> 
<ec:word type="AttName">ccode</ec:word>



(b) 



type value      Definition
ProperNoun      Proper noun
Abbreviation    English abbreviation
TagName         Tag name
TagVal          Keyword that appears as an element value
AttName         Attribute name
AttVal          Keyword that appears as an attribute value


(c) 



<ec:word type="ProperNoun"> 

<ec:div val_ec="7A4D273387C2971FB22E927D74267B2A">
Suzuki <ec:ch utf="x004E">Ichi</ec:ch>ro</ec:div>

</ec:word> 
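
The `ec:word` entries in (a), classified by the type values in (b), can be read as a small dictionary that accompanies the data. A minimal Python sketch of collecting such entries by type follows. The namespace URI `urn:example:ec`, the `root` wrapper element, and the name `collect_words` are hypothetical; the figures do not state them.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

EC_NS = "urn:example:ec"  # hypothetical namespace URI for the ec: prefix

SNIPPET = f"""<root xmlns:ec="{EC_NS}">
<ec:word type="ProperNoun"> Suzuki Ichiro </ec:word>
<ec:word type="Abbreviation">XML</ec:word>
<ec:word type="TagName">ProductCode</ec:word>
<ec:word type="AttName">ccode</ec:word>
</root>"""


def collect_words(xml_text):
    """Group the text of every ec:word element under its type attribute."""
    root = ET.fromstring(xml_text)
    words = defaultdict(list)
    for elem in root.iter(f"{{{EC_NS}}}word"):
        word_type = elem.get("type", "").strip()
        # itertext() also picks up text nested inside ec:div or ec:ch children.
        words[word_type].append("".join(elem.itertext()).strip())
    return dict(words)
```

Calling `collect_words(SNIPPET)` yields one list of words per type value, which a later recognition step could use as its word set.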
S101  Read XML application data
S102  Insert error correction information concerning text in element
S103  Insert error correction information concerning a character string that indicates attribute name or value
S104  Add error correction information using XPath designation
S105  Add information concerning contents of target data
S106  Replace a character or a blank that tends to be misread
S107  Output application data with correction information

Fig. 8 
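
Step S106 of Fig. 8 replaces characters or blanks that tend to be misread. A minimal Python sketch of one way to do this is shown here, wrapping each such character in an `ec:ch` element as the listings in the other figures do. Two points are assumptions: the set of characters treated as easily misread (`CONFUSABLE`), and the use of the Unicode code point as the `utf` attribute value, which the figures do not specify.

```python
from xml.sax.saxutils import escape

# Hypothetical set of characters treated as easily misread: hyphen and blank.
CONFUSABLE = {"-", " "}


def mark_confusables(text, confusable=CONFUSABLE):
    """Wrap each easily misread character in <ec:ch utf="x....">...</ec:ch>."""
    parts = []
    for ch in text:
        if ch in confusable:
            # Assumption: utf carries the code point as four hex digits.
            parts.append(f'<ec:ch utf="x{ord(ch):04x}">{escape(ch)}</ec:ch>')
        else:
            parts.append(escape(ch))
    return "".join(parts)


marked = mark_confusables("9880-2378")
# marked == '9880<ec:ch utf="x002d">-</ec:ch>2378'
```

Because the character code travels in an attribute, a reader can restore the character from `utf` even when the printed glyph itself is recognized incorrectly.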
S201  Read intermediate OCR results
S202  Context process using minimum word set
S203  Extract text segment wherein information concerning contents of target data is written
S204  Process for text with error detection/correction information
S205  Extend word set
S206  Context process using extended word set
S207  Process for text using error detection/correction information



Fig. 9 
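
Steps S202, S205 and S206 of Fig. 9 run a context process twice, first with a minimum word set and then with an extended one. A minimal Python sketch of that two-pass idea follows. It substitutes fuzzy string matching from the standard `difflib` module for the context process, which the flowchart does not describe; the function name `context_process`, the cutoff of 0.8, and the sample tokens are all hypothetical.

```python
import difflib


def context_process(tokens, word_set, cutoff=0.8):
    """Replace each token with its closest word in word_set, if close enough."""
    candidates = sorted(word_set)  # sorted for a deterministic candidate order
    corrected = []
    for token in tokens:
        if token in word_set:
            corrected.append(token)
            continue
        match = difflib.get_close_matches(token, candidates, n=1, cutoff=cutoff)
        corrected.append(match[0] if match else token)
    return corrected


# S202: first pass with a minimum word set (tag names known in advance).
minimum_set = {"ProductCode", "ProductDescription"}
ocr_tokens = ["ProductC0de", "R0settaNet"]
first_pass = context_process(ocr_tokens, minimum_set)

# S205: extend the word set with words recovered from the data itself.
extended_set = minimum_set | {"RosettaNet", "PIP"}

# S206: second pass with the extended word set.
second_pass = context_process(first_pass, extended_set)
```

In this sketch the first pass repairs only the tag name, and the proper noun is repaired in the second pass once it has been added to the word set.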
Fig. 11



JP9 - 2000 - 0267 - JP1 
12/14 



<PurchaseOrder> 

<!-- Book order information -->

<Buyer> 

<Name> Nippon Taro </Name> 

<CreditCard>9880<ec:ch utf="x0dff">-</ec:ch>2378</CreditCard>
</Buyer> 
<OrderList>

<Order> 

<ISBN>4<ec:ch utf="x0dff">-</ec:ch>[illegible]<ec:ch utf="x0dff">-</ec:ch>[illegible]<ec:ch utf="x0dff">-</ec:ch>[illegible]</ISBN>
</Order> 
<Order> 

<ISBN>4<ec:ch utf="x0dff">-</ec:ch>Z14<ec:ch utf="x0dff">-</ec:ch>[illegible]<ec:ch utf="x0dff">-</ec:ch>2</ISBN>

</Order> 
</OrderList> 

<!-- Signature information -->
<Signature> 
<SignedInfo> 
<Transforms> 

<Transform Algorithm="http://www.w3.org/TR/1999/REC-xpath-19991116">

<XPath>

(Buyer | OrderList) 
</XPath> 
</Transform> 
</Transforms> 
</SignedInfo> 

<DigestMethod Algorithm="http://www.w3.org/2000/07/xmldsig#sha1"/>
<DigestValue>Xy6%Dgdeu256& t fdi</DigestValue> 
<SignatureValue>op6&se%$h78s1Wq*ae</SignatureValue>
</Signature> 

<!-- Description of error correction information -->
<ec:ecc ID="01"> 

<ec:div val_ec="7123E6237893963EC25D23A84952C21E">
<ec:target ec:path="//Buyer/Name" />
<ec:target ec:path="//Signature//text()" />
</ec:div>
</ec:ecc>

</PurchaseOrder> 



Fig. 12 
<report>

<author> Suzuki Ichiro </author> 

<title>New technique for data exchange </title>

<chapter> 

<p> this report.. </p> 
</chapter> 
<chapter> 

<p>RosettaNet specifies new standards called PIP... </p> 
</chapter> 

<ec:word type="ProperNoun"> Suzuki Ichiro </ec:word>
<ec:word type="ProperNoun"> RosettaNet </ec:word>
<ec:word type="Abbreviation">PIP</ec:word>

</report>



